Parameter Space Noise for Exploration
Authors
Abstract
Deep reinforcement learning (RL) methods generally engage in exploratory behavior through noise injection in the action space. An alternative is to add noise directly to the agent's parameters, which can lead to more consistent exploration and a richer set of behaviors. Methods such as evolutionary strategies use parameter perturbations, but discard all temporal structure in the process and require significantly more samples. Combining parameter noise with traditional RL methods makes it possible to get the best of both worlds. We demonstrate that both off- and on-policy methods benefit from this approach through experimental comparison of DQN, DDPG, and TRPO on high-dimensional discrete action environments as well as continuous control tasks. Our results show that RL with parameter noise learns more efficiently than traditional RL with action space noise and evolutionary strategies individually.
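The contrast between the two kinds of noise can be illustrated with a minimal sketch. It is not the authors' implementation: the tanh-squashed linear policy, the function names, the noise scale `sigma`, and the choice to sample the parameter perturbation once per episode are all assumptions made for illustration. The same state is visited five times so the effect of each scheme is easy to see.

```python
import numpy as np

def policy(theta, state):
    """Deterministic linear policy with tanh squashing (hypothetical stand-in for a network)."""
    return np.tanh(theta @ state)

def rollout_action_noise(theta, states, sigma, rng):
    """Action-space noise: fresh, independent Gaussian noise is added to every action."""
    n_actions = theta.shape[0]
    return np.array(
        [policy(theta, s) + sigma * rng.standard_normal(n_actions) for s in states]
    )

def rollout_parameter_noise(theta, states, sigma, rng):
    """Parameter-space noise: perturb the weights once (assumed: per episode),
    then act deterministically with the perturbed policy."""
    theta_tilde = theta + sigma * rng.standard_normal(theta.shape)
    return np.array([policy(theta_tilde, s) for s in states])

rng = np.random.default_rng(0)
theta = rng.standard_normal((2, 3))      # 2 action dimensions, 3 state dimensions
state = np.array([0.5, -1.0, 0.25])
states = [state] * 5                     # the same state visited five times

a_action = rollout_action_noise(theta, states, sigma=0.1, rng=rng)
a_param = rollout_parameter_noise(theta, states, sigma=0.1, rng=rng)

# With action noise the five actions differ; with parameter noise they are identical,
# because the perturbed policy is still a deterministic function of the state.
print(a_action)
print(a_param)
```

Under these assumptions, the perturbed policy picks the same action every time it sees the same state within an episode, which is one way to read the abstract's phrase "more consistent exploration".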
Similar resources
Exploring parameter space in reinforcement learning
This paper discusses parameter-based exploration methods for reinforcement learning. Parameter-based methods perturb parameters of a general function approximator directly, rather than adding noise to the resulting actions. Parameter-based exploration unifies reinforcement learning and black-box optimization, and has several advantages over action perturbation. We review two recent parameter-ex...
Online State Space Model Parameter Estimation in Synchronous Machines
The purpose of this paper is to present a new approach based on the Least Squares Error method for estimating the unknown parameters of the nonlinear 3rd order synchronous generator model. The proposed method uses the mathematical relationships between the machine parameters and on-line input/output measurements to estimate the parameters of the nonlinear state space model. The field voltage is...
Application of Single-Frequency Time-Space Filtering Technique for Seismic Ground Roll and Random Noise Attenuation
Time-frequency filtering is an acceptable technique for attenuating noise in 2-D (time-space) and 3-D (time-space-space) reflection seismic data. The common approach for this purpose is transforming each seismic signal from 1-D time domain to a 2-D time-frequency domain and then denoising the signal by a designed filter and finally transforming back the filtered signal to original time domain. ...
Optimal aeroacoustic shape design using the surrogate management framework
Shape optimization is applied to time-dependent trailing-edge flow in order to minimize aerodynamic noise. Optimization is performed using the surrogate management framework (SMF), a non-gradient based pattern search method chosen for its efficiency and rigorous convergence properties. Using SMF, design space exploration is performed not with the expensive actual function but with an inexpensiv...
WIZER: What-If Analyzer for Automated Social Model Space Exploration and Validation
Complex social problems modeled by multi-agent systems have very large parameter and model spaces. The problem of how to model, validate, detect, and plan for the event of bioterrorism is one of these, as it requires faithful modeling of dynamic signal (bioattack event) from complex dynamic noise (normal disease outbreaks and ...
Journal: CoRR
Volume: abs/1706.01905
Publication date: 2017